{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " ## TensorFlow 模型建立与训练 \n",
    " \n",
    " \n",
    " \n",
    "             模型的构建： tf.keras.Model 和 tf.keras.layers\n",
    "\n",
    "             模型的损失函数： tf.keras.losses\n",
    "\n",
    "             模型的优化器： tf.keras.optimizer\n",
    "\n",
    "             模型的评估： tf.keras.metrics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keras 有两个重要的概念： 模型（Model） 和 层（Layer） 。层将各种计算流程和变量进行了封装（例如基本的全连接层，CNN 的卷积层、池化层等），而模型则将各种层进行组织和连接，并封装成一个整体，描述了如何将输入数据通过各种层以及运算而得到输出。在需要模型调用的时候，使用 y_pred = model(X) 的形式即可。Keras 在 tf.keras.layers 下内置了深度学习中大量常用的的预定义层，同时也允许我们自定义层。\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 模型构建 tf.keras.layers 和tf.keras.Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "'''\n",
    "Keras 模型以类的形式呈现，我们可以通过继承 tf.keras.Model 这个 Python 类来定义自己的模型。在继承类中，\n",
    "我们需要重写 __init__() （构造函数，初始化）和 call(input) （模型调用）两个方法，同时也可以根据需要增加自定义的方法。\n",
    "'''\n",
    "class MyModel(tf.keras.Model):\n",
    "    def __init__(self):\n",
    "        super().__init__()     # Python 2 下使用 super(MyModel, self).__init__()\n",
    "        # 此处添加初始化代码（包含 call 方法中会用到的层），例如\n",
    "        # layer1 = tf.keras.layers.BuiltInLayer(...)\n",
    "        # layer2 = MyCustomLayer(...)\n",
    "\n",
    "    def call(self, input):\n",
    "        # 此处添加模型调用的代码（处理输入并返回输出），例如\n",
    "        # x = layer1(input)\n",
    "        # output = layer2(x)\n",
    "        return output\n",
    "\n",
    "    # 还可以添加自定义的方法\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 线性模型例子"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "\n",
    "X = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n",
    "y = tf.constant([[10.0], [20.0]])\n",
    "\n",
    "\n",
    "class Linear(tf.keras.Model):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.dense = tf.keras.layers.Dense(\n",
    "            units=1,\n",
    "            activation=None,\n",
    "            kernel_initializer=tf.zeros_initializer(),\n",
    "            bias_initializer=tf.zeros_initializer()\n",
    "        )\n",
    "\n",
    "    def call(self, input):\n",
    "        output = self.dense(input)\n",
    "        return output\n",
    "\n",
    "\n",
    "# 以下代码结构与前节类似\n",
    "model = Linear()\n",
    "optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)\n",
    "for i in range(100):\n",
    "    with tf.GradientTape() as tape:\n",
    "        y_pred = model(X)      # 调用模型 y_pred = model(X) 而不是显式写出 y_pred = a * X + b\n",
    "        loss = tf.reduce_mean(tf.square(y_pred - y))\n",
    "    grads = tape.gradient(loss, model.variables)    # 使用 model.variables 这一属性直接获得模型中的所有变量\n",
    "    optimizer.apply_gradients(grads_and_vars=zip(grads, model.variables))\n",
    "print(model.variables)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 多层感知机"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MNISTLoader():\n",
    "    def __init__(self):\n",
    "        mnist = tf.keras.datasets.mnist\n",
    "        (self.train_data, self.train_label), (self.test_data, self.test_label) = mnist.load_data()\n",
    "        # MNIST中的图像默认为uint8（0-255的数字）。以下代码将其归一化到0-1之间的浮点数，并在最后增加一维作为颜色通道\n",
    "        self.train_data = np.expand_dims(self.train_data.astype(np.float32) / 255.0, axis=-1)      # [60000, 28, 28, 1]\n",
    "        self.test_data = np.expand_dims(self.test_data.astype(np.float32) / 255.0, axis=-1)        # [10000, 28, 28, 1]\n",
    "        self.train_label = self.train_label.astype(np.int32)    # [60000]\n",
    "        self.test_label = self.test_label.astype(np.int32)      # [10000]\n",
    "        self.num_train_data, self.num_test_data = self.train_data.shape[0], self.test_data.shape[0]\n",
    "\n",
    "    def get_batch(self, batch_size):\n",
    "        # 从数据集中随机取出batch_size个元素并返回\n",
    "        index = np.random.randint(0, self.num_train_data, batch_size)\n",
    "        return self.train_data[index, :], self.train_label[index]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MLP(tf.keras.Model):  \n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.flatten = tf.keras.layers.Flatten()    # Flatten层将除第一维（batch_size）以外的维度展平\n",
    "        self.dense1 = tf.keras.layers.Dense(units=100, activation=tf.nn.relu)\n",
    "        self.dense2 = tf.keras.layers.Dense(units=10)\n",
    "\n",
    "    def call(self, inputs):         # [batch_size, 28, 28, 1]\n",
    "        x = self.flatten(inputs)    # [batch_size, 784]\n",
    "        x = self.dense1(x)          # [batch_size, 100]\n",
    "        x = self.dense2(x)          # [batch_size, 10]\n",
    "        output = tf.nn.softmax(x)\n",
    "        return output"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 模型的训练： tf.keras.losses 和 tf.keras.optimizer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#定义一些模型超参数：\n",
    "num_epochs = 5\n",
    "batch_size = 50\n",
    "learning_rate = 0.001"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = MLP()\n",
    "data_loader = MNISTLoader()\n",
    "optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "然后迭代进行以下步骤：\n",
    "\n",
    "      1. 从 DataLoader 中随机取一批训练数据；\n",
    "\n",
    "      2. 将这批数据送入模型，计算出模型的预测值；\n",
    "\n",
    "      3. 将模型预测值与真实值进行比较，计算损失函数（loss）。这里使用 tf.keras.losses 中的交叉熵函数作为损失函数；\n",
    " \n",
    "      4.计算损失函数关于模型变量的导数；\n",
    "\n",
    "      5.将求出的导数值传入优化器，使用优化器的 apply_gradients 方法更新模型参数以最小化损失函数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_batches = int(data_loader.num_train_data // batch_size * num_epochs)\n",
    "for batch_index in range(num_batches):\n",
    "    X, y = data_loader.get_batch(batch_size)\n",
    "    with tf.GradientTape() as tape:\n",
    "        y_pred = model(X)\n",
    "        loss = tf.keras.losses.sparse_categorical_crossentropy(y_true=y, y_pred=y_pred)\n",
    "        loss = tf.reduce_mean(loss)\n",
    "        print(\"batch %d: loss %f\" % (batch_index, loss.numpy()))\n",
    "    grads = tape.gradient(loss, model.variables)\n",
    "    optimizer.apply_gradients(grads_and_vars=zip(grads, model.variables))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\"\"\"\n",
    "\n",
    "在 tf.keras 中，有两个交叉熵相关的损失函数 tf.keras.losses.categorical_crossentropy 和 \n",
    "tf.keras.losses.sparse_categorical_crossentropy 。其中 sparse 的含义是，\n",
    "真实的标签值 y_true 可以直接传入 int 类型的标签类别。具体而言：\n",
    "\n",
    "\n",
    "\"\"\"\n",
    "\n",
    "loss = tf.keras.losses.sparse_categorical_crossentropy(y_true=y, y_pred=y_pred)\n",
    "\n",
    "\n",
    "loss = tf.keras.losses.categorical_crossentropy(\n",
    "    y_true=tf.one_hot(y, depth=tf.shape(y_pred)[-1]),\n",
    "    y_pred=y_pred\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 模型的评估： tf.keras.metrics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sparse_categorical_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()\n",
    "num_batches = int(data_loader.num_test_data // batch_size)\n",
    "for batch_index in range(num_batches):\n",
    "    start_index, end_index = batch_index * batch_size, (batch_index + 1) * batch_size\n",
    "    y_pred = model.predict(data_loader.test_data[start_index: end_index])\n",
    "    sparse_categorical_accuracy.update_state(y_true=data_loader.test_label[start_index: end_index], y_pred=y_pred)\n",
    "print(\"test accuracy: %f\" % sparse_categorical_accuracy.result())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 卷积神经网络（CNN）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class CNN(tf.keras.Model):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.conv1 = tf.keras.layers.Conv2D(\n",
    "            filters=32,             # 卷积层神经元（卷积核）数目\n",
    "            kernel_size=[5, 5],     # 感受野大小\n",
    "            padding='same',         # padding策略（vaild 或 same）\n",
    "            activation=tf.nn.relu   # 激活函数\n",
    "        )\n",
    "        self.pool1 = tf.keras.layers.MaxPool2D(pool_size=[2, 2], strides=2)\n",
    "        self.conv2 = tf.keras.layers.Conv2D(\n",
    "            filters=64,\n",
    "            kernel_size=[5, 5],\n",
    "            padding='same',\n",
    "            activation=tf.nn.relu\n",
    "        )\n",
    "        self.pool2 = tf.keras.layers.MaxPool2D(pool_size=[2, 2], strides=2)\n",
    "        self.flatten = tf.keras.layers.Reshape(target_shape=(7 * 7 * 64,))\n",
    "        self.dense1 = tf.keras.layers.Dense(units=1024, activation=tf.nn.relu)\n",
    "        self.dense2 = tf.keras.layers.Dense(units=10)\n",
    "\n",
    "    def call(self, inputs):\n",
    "        x = self.conv1(inputs)                  # [batch_size, 28, 28, 32]\n",
    "        x = self.pool1(x)                       # [batch_size, 14, 14, 32]\n",
    "        x = self.conv2(x)                       # [batch_size, 14, 14, 64]\n",
    "        x = self.pool2(x)                       # [batch_size, 7, 7, 64]\n",
    "        x = self.flatten(x)                     # [batch_size, 7 * 7 * 64]\n",
    "        x = self.dense1(x)                      # [batch_size, 1024]\n",
    "        x = self.dense2(x)                      # [batch_size, 10]\n",
    "        output = tf.nn.softmax(x)\n",
    "        return output\n",
    "\"\"\"\n",
    "使用 Keras 中预定义的经典卷积神经网络结构 tf.keras.applications 中有一些预定义好的经典卷积神经网络结构，\n",
    "如 VGG16 、 VGG19 、 ResNet 、 MobileNet 等。\n",
    "我们可以直接调用这些经典的卷积神经网络结构（甚至载入预训练的参数），而无需手动定义网络结构。\n",
    "model = tf.keras.applications.MobileNetV2()\n",
    "\"\"\"\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 循环神经网络（RNN）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "'''\n",
    "首先，还是实现一个简单的 DataLoader 类来读取文本，并以字符为单位进行编码。设字符种类数为 num_chars ，\n",
    "则每种字符赋予一个 0 到 num_chars - 1 之间的唯一整数编号 i。\n",
    "'''\n",
    "\n",
    "class DataLoader():\n",
    "    def __init__(self):\n",
    "        path = tf.keras.utils.get_file('nietzsche.txt',\n",
    "            origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')\n",
    "        with open(path, encoding='utf-8') as f:\n",
    "            self.raw_text = f.read().lower()\n",
    "        self.chars = sorted(list(set(self.raw_text)))\n",
    "        self.char_indices = dict((c, i) for i, c in enumerate(self.chars))\n",
    "        self.indices_char = dict((i, c) for i, c in enumerate(self.chars))\n",
    "        self.text = [self.char_indices[c] for c in self.raw_text]\n",
    "\n",
    "    def get_batch(self, seq_length, batch_size):\n",
    "        seq = []\n",
    "        next_char = []\n",
    "        for i in range(batch_size):\n",
    "            index = np.random.randint(0, len(self.text) - seq_length)\n",
    "            seq.append(self.text[index:index+seq_length])\n",
    "            next_char.append(self.text[index+seq_length])\n",
    "        return np.array(seq), np.array(next_char)       # [batch_size, seq_length], [num_batch]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class RNN(tf.keras.Model):\n",
    "    def __init__(self, num_chars, batch_size, seq_length):\n",
    "        super().__init__()\n",
    "        self.num_chars = num_chars\n",
    "        self.seq_length = seq_length\n",
    "        self.batch_size = batch_size\n",
    "        self.cell = tf.keras.layers.LSTMCell(units=256)\n",
    "        self.dense = tf.keras.layers.Dense(units=self.num_chars)\n",
    "\n",
    "    def call(self, inputs, from_logits=False):\n",
    "        inputs = tf.one_hot(inputs, depth=self.num_chars)       # [batch_size, seq_length, num_chars]\n",
    "        state = self.cell.get_initial_state(batch_size=self.batch_size, dtype=tf.float32)\n",
    "        for t in range(self.seq_length):\n",
    "            output, state = self.cell(inputs[:, t, :], state)\n",
    "        logits = self.dense(output)\n",
    "        if from_logits:\n",
    "            return logits\n",
    "        else:\n",
    "            return tf.nn.softmax(logits)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_batches = 1000\n",
    "seq_length = 40\n",
    "batch_size = 50\n",
    "learning_rate = 1e-3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_loader = DataLoader()\n",
    "model = RNN(num_chars=len(data_loader.chars), batch_size=batch_size, seq_length=seq_length)\n",
    "optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)\n",
    "for batch_index in range(num_batches):\n",
    "    X, y = data_loader.get_batch(seq_length, batch_size)\n",
    "    with tf.GradientTape() as tape:\n",
    "        y_pred = model(X)\n",
    "        loss = tf.keras.losses.sparse_categorical_crossentropy(y_true=y, y_pred=y_pred)\n",
    "        loss = tf.reduce_mean(loss)\n",
    "        print(\"batch %d: loss %f\" % (batch_index, loss.numpy()))\n",
    "    grads = tape.gradient(loss, model.variables)\n",
    "    optimizer.apply_gradients(grads_and_vars=zip(grads, model.variables))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def predict(self, inputs, temperature=1.):\n",
    "    batch_size, _ = tf.shape(inputs)\n",
    "    logits = self(inputs, from_logits=True)\n",
    "    prob = tf.nn.softmax(logits / temperature).numpy()\n",
    "    return np.array([np.random.choice(self.num_chars, p=prob[i, :])\n",
    "                     for i in range(batch_size.numpy())])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#通过这种方式进行 “滚雪球” 式的连续预测，即可得到生成文本。\n",
    "\n",
    "X_, _ = data_loader.get_batch(seq_length, 1)\n",
    "for diversity in [0.2, 0.5, 1.0, 1.2]:\n",
    "    X = X_\n",
    "    print(\"diversity %f:\" % diversity)\n",
    "    for t in range(400):\n",
    "        y_pred = model.predict(X, diversity)\n",
    "        print(data_loader.indices_char[y_pred[0]], end='', flush=True)\n",
    "        X = np.concatenate([X[:, 1:], np.expand_dims(y_pred, axis=1)], axis=-1)\n",
    "    print(\"\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Keras Pipeline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Keras Sequential/Functional API 模式建立模型 \n",
    "最典型和常用的神经网络结构是将一堆层按特定顺序叠加起来，那么，我们是不是只需要提供一个层的列表，就能由 Keras 将它们自动首尾相连，形成模型呢？Keras 的 Sequential API 正是如此。通过向 tf.keras.models.Sequential() 提供一个层的列表，就能快速地建立一个 tf.keras.Model 模型并返回："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = tf.keras.models.Sequential([\n",
    "            tf.keras.layers.Flatten(),\n",
    "            tf.keras.layers.Dense(100, activation=tf.nn.relu),\n",
    "            tf.keras.layers.Dense(10),\n",
    "            tf.keras.layers.Softmax()\n",
    "        ])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "不过，这种层叠结构并不能表示任意的神经网络结构。为此，Keras 提供了 Functional API，帮助我们建立更为复杂的模型，例如多输入 / 输出或存在参数共享的模型。其使用方法是将层作为可调用的对象并返回张量（这点与之前章节的使用方法一致），并将输入向量和输出向量提供给 tf.keras.Model 的 inputs 和 outputs 参数，示例如下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inputs = tf.keras.Input(shape=(28, 28, 1))\n",
    "        x = tf.keras.layers.Flatten()(inputs)\n",
    "        x = tf.keras.layers.Dense(units=100, activation=tf.nn.relu)(x)\n",
    "        x = tf.keras.layers.Dense(units=10)(x)\n",
    "        outputs = tf.keras.layers.Softmax()(x)\n",
    "        model = tf.keras.Model(inputs=inputs, outputs=outputs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 使用 Keras Model 的 compile 、 fit 和 evaluate 方法训练和评估模型 \n",
    "\n",
    "\n",
    "当模型建立完成后，通过 tf.keras.Model 的 compile 方法配置训练过程："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.compile(\n",
    "        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),\n",
    "        loss=tf.keras.losses.sparse_categorical_crossentropy,\n",
    "        metrics=[tf.keras.metrics.sparse_categorical_accuracy]\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 接下来，可以使用 tf.keras.Model 的 fit 方法训练模型：\n",
    "  model.fit(data_loader.train_data, data_loader.train_label, epochs=num_epochs, batch_size=batch_size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "tf.keras.Model.fit 接受 5 个重要的参数：\n",
    "\n",
    "x ：训练数据；\n",
    "\n",
    "y ：目标数据（数据标签）；\n",
    "\n",
    "epochs ：将训练数据迭代多少遍；\n",
    "\n",
    "batch_size ：批次的大小；\n",
    "\n",
    "validation_data ：验证数据，可用于在训练过程中监控模型的性能。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#最后，使用 tf.keras.Model.evaluate 评估训练效果，提供测试数据及标签即可：\n",
    "print(model.evaluate(data_loader.test_data, data_loader.test_label))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 自定义层、损失函数和评估指标"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#自定义层需要继承 tf.keras.layers.Layer 类，并重写 __init__ 、 build 和 call 三个方法，如下所示：\n",
    "\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        # 初始化代码\n",
    "\n",
    "    def build(self, input_shape):     # input_shape 是一个 TensorShape 类型对象，提供输入的形状\n",
    "        # 在第一次使用该层的时候调用该部分代码，在这里创建变量可以使得变量的形状自适应输入的形状\n",
    "        # 而不需要使用者额外指定变量形状。\n",
    "        # 如果已经可以完全确定变量的形状，也可以在__init__部分创建变量\n",
    "        self.variable_0 = self.add_weight(...)\n",
    "        self.variable_1 = self.add_weight(...)\n",
    "\n",
    "    def call(self, inputs):\n",
    "        # 模型调用的代码（处理输入并返回输出）\n",
    "        return output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#自定义全连接层（ tf.keras.layers.Dense ），\n",
    "class LinearLayer(tf.keras.layers.Layer):\n",
    "    def __init__(self, units):\n",
    "        super().__init__()\n",
    "        self.units = units\n",
    "\n",
    "    def build(self, input_shape):     # 这里 input_shape 是第一次运行call()时参数inputs的形状\n",
    "        self.w = self.add_variable(name='w',\n",
    "            shape=[input_shape[-1], self.units], initializer=tf.zeros_initializer())\n",
    "        self.b = self.add_variable(name='b',\n",
    "            shape=[self.units], initializer=tf.zeros_initializer())\n",
    "\n",
    "    def call(self, inputs):\n",
    "        y_pred = tf.matmul(inputs, self.w) + self.b\n",
    "        return y_pred"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#在定义模型的时候，我们便可以如同 Keras 中的其他层一样，调用我们自定义的层 LinearLayer：\n",
    "class LinearModel(tf.keras.Model):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.layer = LinearLayer(units=1)\n",
    "\n",
    "    def call(self, inputs):\n",
    "        output = self.layer(inputs)\n",
    "        return output"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 自定义损失函数和评估指标\n",
    "\n",
    "自定义损失函数需要继承 tf.keras.losses.Loss 类，重写 call 方法即可，输入真实值 y_true 和模型预测值 y_pred ，输出模型预测值和真实值之间通过自定义的损失函数计算出的损失值。下面的示例为均方差损失函数：\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MeanSquaredError(tf.keras.losses.Loss):\n",
    "    def call(self, y_true, y_pred):\n",
    "        return tf.reduce_mean(tf.square(y_pred - y_true))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "自定义评估指标需要继承 tf.keras.metrics.Metric 类，并重写 __init__ 、 update_state 和 result 三个方法。下面的示例对前面用到的 SparseCategoricalAccuracy 评估指标类做了一个简单的重实现："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class SparseCategoricalAccuracy(tf.keras.metrics.Metric):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.total = self.add_weight(name='total', dtype=tf.int32, initializer=tf.zeros_initializer())\n",
    "        self.count = self.add_weight(name='count', dtype=tf.int32, initializer=tf.zeros_initializer())\n",
    "\n",
    "    def update_state(self, y_true, y_pred, sample_weight=None):\n",
    "        values = tf.cast(tf.equal(y_true, tf.argmax(y_pred, axis=-1, output_type=tf.int32)), tf.int32)\n",
    "        self.total.assign_add(tf.shape(y_true)[0])\n",
    "        self.count.assign_add(tf.reduce_sum(values))\n",
    "\n",
    "    def result(self):\n",
    "        return self.count / self.total"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
